Marginal Distribution

Definition

The marginal distribution of a subset of a collection of random variables is the probability distribution of the variables contained in the subset.

p(x) = \sum_y p(x, y) = \sum_y p(x|y)\, p(y)

which corresponds to marginalizing out y, i.e. we throw away y and restrict attention to x. For continuous variables the sum is replaced by an integral:

p(x) = \int_y p(x, y)\, dy = \int_y p(x|y)\, p(y)\, dy
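
As a concrete check of the discrete formula, here is a minimal NumPy sketch; the 3x2 joint table is made up purely for illustration. It shows that summing the joint table over y and applying the factored form \sum_y p(x|y) p(y) give the same marginal p(x).

```python
import numpy as np

# Hypothetical joint distribution p(x, y): rows index x, columns index y.
p_xy = np.array([[0.10, 0.20],
                 [0.25, 0.15],
                 [0.05, 0.25]])

# Marginalize y out directly: p(x) = sum_y p(x, y)
p_x = p_xy.sum(axis=1)

# Factored form: p(x) = sum_y p(x|y) p(y)
p_y = p_xy.sum(axis=0)          # p(y)
p_x_given_y = p_xy / p_y        # column j holds p(x | y = j)
p_x_alt = (p_x_given_y * p_y).sum(axis=1)

print(p_x)      # [0.3 0.4 0.3]
print(p_x_alt)  # identical, via p(x|y) p(y)
```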

Application of marginalisation

p(x|z) = \sum_y p(x, y|z) = \sum_y p(x|y, z)\, p(y|z)

p(x|z) = \int_y p(x|y, z)\, p(y|z)\, dy

when x is conditionally independent of z given y, i.e. p(x|y, z) = p(x|y), this simplifies to:

p(x|z) = \int_y p(x|y)\, p(y|z)\, dy
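
In the discrete case this marginalization is just a matrix-vector product. Below is a small sketch with hypothetical conditional tables for one fixed value of z (the numbers are arbitrary), showing that \sum_y p(x|y, z) p(y|z) again yields a valid distribution over x.

```python
import numpy as np

# Hypothetical conditional tables for one fixed z:
# p_x_given_yz[x, y] = p(x | y, z),  p_y_given_z[y] = p(y | z)
p_x_given_yz = np.array([[0.7, 0.2],
                         [0.3, 0.8]])
p_y_given_z = np.array([0.4, 0.6])

# Marginalize y out of the conditional: p(x|z) = sum_y p(x|y, z) p(y|z)
p_x_given_z = p_x_given_yz @ p_y_given_z

print(p_x_given_z)        # [0.4 0.6]
print(p_x_given_z.sum())  # 1.0, still a proper distribution
```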

Predictive Distribution

For a parameter θ and observed data Y, both the prior p(θ) and the posterior p(θ|Y) are distributions over the parameter θ. If you want to predict new data ŷ, you need a predictive distribution of ŷ built either from prior information alone (the prior predictive distribution) or from the observed data (the posterior predictive distribution).

Prior Predictive Distribution

obtained by marginalising θ over the prior:

p(\hat{y}) = \int p(\hat{y}|\theta)\, p(\theta)\, d\theta
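
A simple way to see this integral in practice is Monte Carlo: draw θ from the prior and average p(ŷ|θ) over the draws. The sketch below assumes a toy Beta-Bernoulli model (θ ~ Beta(2, 2), ŷ|θ ~ Bernoulli(θ)) chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy model: theta ~ Beta(2, 2), y_hat | theta ~ Bernoulli(theta).
# Prior predictive p(y_hat) = ∫ p(y_hat|theta) p(theta) dtheta, approximated
# by averaging p(y_hat = 1 | theta) = theta over prior draws of theta.
theta_prior = rng.beta(2.0, 2.0, size=100_000)
p_yhat_1 = theta_prior.mean()

print(p_yhat_1)  # ≈ 0.5, matching the analytic value a/(a+b) for Beta(2, 2)
```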

Posterior Predictive Distribution

p(\hat{y}|Y) = \int p(\hat{y}|\theta, Y)\, p(\theta|Y)\, d\theta

if we assume the new data ŷ is conditionally independent of Y given θ:

p(\hat{y}|Y) = \int p(\hat{y}|\theta)\, p(\theta|Y)\, d\theta
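
The same Monte Carlo idea works here, now averaging over draws from the posterior rather than the prior. The sketch below again assumes the toy Beta-Bernoulli model, where the conjugate posterior Beta(a + Σy, b + n − Σy) is available in closed form; the observed data Y is made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy model: theta ~ Beta(2, 2), observations in Y are Bernoulli(theta).
Y = np.array([1, 1, 0, 1, 1, 0, 1, 1])   # hypothetical observed data
a_post = 2.0 + Y.sum()                   # conjugate update of the Beta prior
b_post = 2.0 + len(Y) - Y.sum()

# Posterior predictive p(y_hat = 1 | Y) = ∫ p(y_hat = 1|theta) p(theta|Y) dtheta,
# approximated by averaging theta over posterior draws.
theta_post = rng.beta(a_post, b_post, size=100_000)
p_yhat_1 = theta_post.mean()

print(p_yhat_1)  # ≈ a_post / (a_post + b_post) = 8/12 ≈ 0.667
```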